1.  Introduction


D. Gabor recognized that the classical interpretation of images was limited, in that an image was
viewed either as a collection of pixels (spatial domain) or as a sum of sinusoids of infinite
extent (spatial frequency domain). These are only the extremes of a joint spatial/frequency
representation of an image; in Gabor's interpretation, spatial frequency is a local phenomenon
that varies with position. The general form of the Gabor filters is given by Zhou [4] as:

m  m 

with  Mq  =  cos(^)  and  ^ 

M  N  M  N 

Here M denotes the spatial dimension, m the resolution level, N the total number of orientations
and n the preferred orientation. The position parameters $(x_0, y_0)$ localize the region in
visual space.

The Gabor filter is optimal with respect to the joint spatial/frequency representation. The proof
follows from noting that the 2-D Gabor filter is separable (eq. (1)) and that each 1-D
counterpart achieves the minimal uncertainty product [6].

It is interesting that psychological and physiological studies present evidence that human and
mammalian vision supports some form of spatial-frequency analysis that maximizes the simultaneous
localization of energy in both the spatial and local frequency domains. The application of a
suitable model like Gabor Jets in facial recognition is well motivated by the observation that
some low-level, spatial-frequency-efficient processing is done at very early stages of
development [7-8] and can account for size, shape and orientation discrimination. The argument is
supported by indications that a large number of cells in the primary visual cortex are activated
by the presence of a particular line or edge of specific orientation, size and position [9];
these cells are modeled by the “cortical receptive field profiles” model, which is a class of
Gabor filters. For our application, Gabor Jets are simply a set of Gabor coefficients taken from
one fixed image point, with several points included in the overall set. This is a simplified form
of the Gabor Jet procedure and does not require the elastic graph matching procedures used in
facial recognition.

Another motivation for employing Gabor jets as a post-processing clutter rejecter is the great
deal of research in facial recognition, invariant shape recognition [4], directional feature
extraction for perceptual grouping [10] and image retrieval [11]. Wiskott et al. [12] employed
Gabor Jets in a facial recognition task using fiducial points of the face (the pupils, corners of
the mouth, tip of the nose, etc.) to create labeled graphs. They further generalized these facial
feature graphs to create a representative set of model graphs called a face bunch graph (FBG).
The FBG lessened the complexity and expense of covering a multitude of feature/facial
combinations. We cite the work of Wiskott [12] and Zhou [4] because a connection has been made
between the low-level facial recognition abilities of a developing infant and the Gabor
representation as a suitable model for mammalian vision. We believe our machine vision task of
delineating broadband, low-resolution IR imagery of target chips post detection may benefit from
a Gabor application.


2.  Back  Propagation 


A simple and widely used architecture for discrimination is the back-propagation neural network
(BPNN), which can separate complex hyperboundaries in feature space depending on the number and
size of the hidden layers [12]. We have employed the BPNN with an adaptive learning rate that
allows fine-grain adjustments during training. Smoothing is also incorporated; it controls the
weight adjustment based on past values of the gradient descent and can prevent the training
process from terminating in a shallow local minimum. For class discrimination, the BPNN can
provide both a robust classifier and a measure of confidence in the classification decision.
Neural networks derive their computational power from their parallel-distributed structure and
their ability to learn and adapt. We trained our BPNN to a fixed 5% false alarm rate before
testing and validation. False alarm rate is defined as the number of false alarms divided by the
total number of clutter chips. Specifics governing the neural network training flow are beyond
the scope of this paper. We present several results using the BPNN for our clutter rejection
task in later sections of the paper.
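
As an illustration only, the smoothing described above behaves much like the familiar momentum term in gradient descent. The sketch below assumes a plain momentum update applied to a toy one-dimensional quadratic; the function name, learning rate and momentum value are hypothetical and are not taken from the training flow used in this work, and the adaptive learning rate is not modeled.

```python
def momentum_descent(grad, w0, learning_rate=0.1, momentum=0.9, steps=500):
    """Gradient descent with a momentum (smoothing) term.

    Assumption: the weight change is a running blend of past gradients,
    which is one common way to smooth back-propagation updates.
    """
    w = w0
    velocity = 0.0
    for _ in range(steps):
        # Blend the previous update with the current negative gradient.
        velocity = momentum * velocity - learning_rate * grad(w)
        w += velocity
    return w


# Toy error surface E(w) = (w - 3)^2, whose gradient is 2 (w - 3).
w_final = momentum_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

The accumulated velocity lets the update carry through small bumps in the error surface, which is the behavior the text attributes to smoothing.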


3.  Database  for  Clutter  Rejection 


The database consisted of over 20,000 target and clutter chips extracted manually from 10-bit
gray scale FLIR imagery. A total of 10 targets were included in the database, with chips being
40 × 75 pixels in size. Ground-truth information was used to center the silhouette for chip
extraction at various viewing aspects. Chips termed “region of interest” or ROI were also used;
these were targets in less favorable conditions, i.e., near clutter or partially obscured.
Figure 1 shows some examples of the various chips comparing the target and clutter sets.

[Figure 1 panels, top to bottom: signature chips for some targets in the data set; “region of
interest” chips; clutter chips identified as targets by the ARTM detector; clutter chips.]

Figure 1. Several examples of the various classes of chips considered in this research.


The target chips were extracted from the original input frame with the silhouettes centered
within the chip and scaled to a constant range of 2 kilometers. This led to some chips being
down-sampled and others being up-sampled from the original, but the information in the chips
remained relatively intact.


4.  Procedure  for  Gabor  Jets  and  Feature  Vectors


Several Gabor Jet feature sets can be implemented in testing for clutter rejection, depending on
one's choice of resolution levels and orientations. Examples of a symmetric and an
anti-symmetric Gabor filter are shown in figure 2. Preliminary experimentation led to adopting a
Gabor Jet feature set that included four resolution levels and six orientations. We found that
employing more resolution levels and orientations gave no advantage, and simpler sets, such as a
Gabor Jet feature set consisting of 3 resolution levels with 4 orientations, did not perform as
well.


Figure  2.  Example  of  symmetric  and  anti-symmetric  Gabor  filter. 

For all Gabor features generated, we set the following relationship in equation 1:

$\omega = \dfrac{c}{\sigma}$    (2)

with c being a constant and $\sigma$ the standard deviation of the Gaussian in the spatial
domain x and y. This eliminates cross terms from equation 1.

The above relationship then allows the variance in the x or y spatial domain to be expressed as:

$\sigma^2 = M^2 \, 2^{-m}$    (3)

with the choice of constant $c = \pi$. We present results using 4 resolution levels and 6
orientations taken at 5 selected points of the image chip. Our selection of resolution levels is
equivalent to 4 different variances for the Gabor filters, with values of $2\sqrt{2}$, $2$,
$\sqrt{2}$ and $1$.

Orientations at each resolution level are $0$, $\pi/6$, $\pi/3$, $\pi/2$, $2\pi/3$ and $5\pi/6$.

When  considering  the  sample  of  5  fixed  points  for  each  chip  in  the  clutter  rejection  scheme,  this 
will  generate  a  feature  vector  of  120  features  total.  We  will  generate  a  standard  feature  set  that  is 
beneficial  for  the  various  sizes  of  targets.  We  will  look  at  small,  medium  and  elongated  feature 
sets  which  are  more  size  specific  and  an  extended  feature  set  that  includes  all  size  features. 
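
The count of 120 features (4 resolution levels × 6 orientations × 5 points) can be made concrete with a minimal sketch. Everything about the kernel here is an assumption: it uses a generic isotropic complex Gabor (Gaussian envelope times a plane wave with frequency $c/\sigma$, $c = \pi$), treats the four listed values as the Gaussian spread, takes the response magnitude as the feature, and samples five hypothetical point locations. It is not the filter form of Zhou [4] and not the point placement used in this work.

```python
import cmath
import math


def gabor_kernel(sigma, theta, c=math.pi):
    """Complex Gabor kernel as a {(dy, dx): value} map (assumed isotropic form)."""
    omega = c / sigma                    # frequency tied to the Gaussian spread
    half = int(math.ceil(3.0 * sigma))   # truncate the envelope near 3 sigma
    kernel = {}
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            phase = omega * (dx * math.cos(theta) + dy * math.sin(theta))
            kernel[(dy, dx)] = envelope * cmath.exp(1j * phase)
    return kernel


def gabor_jet_features(image, points, sigmas, thetas):
    """Stack |filter response| for every point, spread and orientation."""
    rows, cols = len(image), len(image[0])
    kernels = [gabor_kernel(s, t) for s in sigmas for t in thetas]
    features = []
    for row, col in points:
        for kernel in kernels:
            response = 0j
            for (dy, dx), weight in kernel.items():
                r = min(max(row + dy, 0), rows - 1)   # clamp at chip borders
                k = min(max(col + dx, 0), cols - 1)
                response += image[r][k] * weight
            features.append(abs(response))
    return features


SIGMAS = [2.0 * math.sqrt(2.0), 2.0, math.sqrt(2.0), 1.0]    # 4 resolution levels
THETAS = [k * math.pi / 6.0 for k in range(6)]               # 6 orientations
POINTS = [(20, 37), (12, 20), (12, 55), (28, 20), (28, 55)]  # hypothetical (row, col)

# Synthetic 40 x 75 chip standing in for real FLIR data.
chip = [[(r * c) % 17 for c in range(75)] for r in range(40)]
features = gabor_jet_features(chip, POINTS, SIGMAS, THETAS)
```

Each fixed point contributes one jet of 24 coefficients, so five points give the 120-element vector that feeds the network.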

The standard Gabor Jet feature set was used to train, test and validate a 120 by 60 by 2 BPNN.
The hit rate for this BPNN architecture trained to 5% FAR was 90% post detection. Hit rate is
defined as the number of target hits divided by the total number of target chips in the data
set. Examples of the Gabor processed chips, with the points used for the standard Gabor jet
marked in each image, are shown in figure 3. We also created size-specific trained BPNNs and
fused their outputs as a second procedure using the standard set. See Table 1 in the results
section for the results. Here we need to briefly mention how we selected the size-specific
training, testing and validation sets. First, we employed the k-means [14] algorithm to
determine the size of the silhouettes for the various aspect angles of each target. Either
signature (sig) or region of interest (roi) target silhouettes were used to generate specific
sets for 3 BPNNs we called “small,” “medium,” and “elongated.” We used k-means only to select
the training sets for the 3 size-specific nets, and not to determine the test and validation
sets. K-means results were also utilized as a general guideline to determine the spatial
placement of the size-specific features, as seen in figure 4.


Figure  4.  Small,  medium  and  elongated  Gabor  jet  processed  images  and  the  feature  locations. 

The  K-means  algorithm  employs  a  measure  based  on  the  minimization  of  the  sum  of  the  squared 
distances  from  all  points  in  a  cluster  domain  to  the  cluster  center. 

The K-means procedure is as follows:

Step 1. Choose the initial K cluster centers $z_1(1), \ldots, z_K(1)$. These are arbitrary and
can be the first K samples of the data set.

Step 2. At the k-th iteration, distribute the samples $\{x\}$ among the K cluster domains,
using the relationship

$x \in S_j(k)$  if  $\|x - z_j(k)\| < \|x - z_i(k)\|$

for all $i = 1, 2, \ldots, K$, $i \neq j$, where $S_j(k)$ denotes the set of samples whose
cluster center is $z_j(k)$. Ties can be resolved arbitrarily.

Step 3. From the result in step 2, compute the new cluster centers $z_j(k+1)$,
$j = 1, 2, \ldots, K$, such that the sum of squared distances from all points in $S_j(k)$ to the
new cluster center is minimized. Here, the new cluster center is computed so that the
performance index

$J_j = \sum_{x \in S_j(k)} \|x - z_j(k+1)\|^2$,  $j = 1, 2, \ldots, K$

is minimized. The $z_j(k+1)$ that minimizes this performance index is the sample mean of
$S_j(k)$. The new cluster center is given by

$z_j(k+1) = \dfrac{1}{N_j} \sum_{x \in S_j(k)} x$,  $j = 1, 2, \ldots, K$

where $N_j$ is the number of samples in $S_j(k)$. Thus, the cluster centers are sequentially
updated.

Step 4. If $z_j(k+1) = z_j(k)$ for $j = 1, 2, \ldots, K$, the algorithm has converged and the
procedure is terminated. Otherwise, return to step 2.
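
The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: samples are tuples of numbers, the first K samples seed the centers as Step 1 permits, and an empty cluster simply keeps its previous center (a choice the text does not address). The function name, the iteration cap and the toy data are hypothetical.

```python
def k_means(samples, k, max_iter=100):
    """Basic K-means following Steps 1 to 4; returns (centers, clusters)."""
    # Step 1: take the first K samples as the initial cluster centers.
    centers = [tuple(s) for s in samples[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Step 2: assign every sample to its nearest center.
        clusters = [[] for _ in range(k)]
        for x in samples:
            dists = [sum((a - b) ** 2 for a, b in zip(x, z)) for z in centers]
            clusters[dists.index(min(dists))].append(x)
        # Step 3: the new center of each cluster is its sample mean.
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[j])  # assumption: keep the old center
        # Step 4: stop when no center has moved.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Toy (height, width) pairs forming two obvious groups.
toy = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centers, clusters = k_means(toy, k=2)
```

Taking the minimum distance by index resolves ties in favor of the lower-numbered cluster, which is one arbitrary tie-break of the kind Step 2 allows.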

Cluster results for the separation of the sig/roi sets into 3 clusters, given as (height,
width), were (23, 36) for small silhouettes, (26, 54) for medium silhouettes and (26, 64) for
elongated silhouettes.

Note that an elongated target can have a small silhouette when viewed head on or from behind, or
at angles close to these (0 and 180 degrees). The silhouette sizes were the basis for the
separation procedure, not the physical size of the target. Figure 5 shows examples of the full
Gabor jet feature set combining small, medium and elongated features into an extended feature
vector.


Figure 5. Examples of the full Gabor jet feature set combining small, medium and
elongated features into an extended feature vector.

The procedure for developing the BPNN of choice was ad hoc in nature. We tried several differing
architectures using either 1 or 2 hidden layers with different numbers of hidden nodes in each
layer. This led to the selection of a BPNN with only 1 hidden layer and half as many hidden
nodes as input nodes. A single-hidden-layer BPNN with this 50% rule performed better than BPNNs
with multiple hidden layers, and it is a very simple architecture.
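
The 50% rule is simple arithmetic, shown here as a small sketch. The helper name and its parameters are hypothetical; only the rule itself and the two output nodes come from the text.

```python
def bpnn_layer_sizes(n_inputs, n_outputs=2, hidden_fraction=0.5):
    """Return (input, hidden, output) node counts for a single hidden layer."""
    n_hidden = int(round(n_inputs * hidden_fraction))
    return (n_inputs, n_hidden, n_outputs)


# Standard feature set: 4 resolution levels x 6 orientations x 5 points.
standard = bpnn_layer_sizes(4 * 6 * 5)
```

For the standard 120-feature set this gives the 120 by 60 by 2 network described earlier.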


5.  Results 


Table 1 gives the results for each Gabor feature/BPNN experiment, and the following paragraphs
explain each entry. Each experiment is described briefly and given a corresponding tag in the
table, which lists the test and validation scores.

The  standard  Gabor  feature  set  was  described  earlier  and  the  neural  network  architecture  had  120 
input  nodes,  60  hidden  nodes  and  2  output  nodes  (120x60x2). 

Table 1. Results of the 5 differing architectures and Gabor feature combinations, where all
BPNNs were trained to a 5% FAR before test and validation.

BPNN/Gabor    Hit Rate Test    Hit Rate Validation
STD           90%              91%
STD-SEG       88%              89%
SPC-SEG       91.7%            91%
SPC-AEE       95.5%            95.5%
STD-9         95.1%            96%

The BPNN was trained to a false alarm rate of 5%, and then the hit rate was determined for the
test and validation sets. The abbreviation for this result in table 1 is STD.
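
One way to read "trained to a 5% false alarm rate" is as an operating threshold on the target output, chosen so that at most 5% of clutter chips score above it. The sketch below works under that assumption, using the definitions of false alarm rate and hit rate given in the text; the function names and the synthetic scores are hypothetical and do not reproduce the training procedure actually used.

```python
import math


def threshold_for_far(clutter_scores, max_far=0.05):
    """Lowest threshold such that the share of clutter scoring above it <= max_far."""
    ranked = sorted(clutter_scores, reverse=True)
    allowed = int(math.floor(max_far * len(ranked) + 1e-9))  # tolerated false alarms
    if allowed >= len(ranked):
        return float("-inf")
    # A chip is declared a target only if its score is strictly above this value.
    return ranked[allowed]


def rate_above(scores, threshold):
    """Fraction of chips declared target: FAR on clutter, hit rate on targets."""
    return sum(1 for s in scores if s > threshold) / len(scores)


# Synthetic scores for illustration only.
clutter = [i / 100 for i in range(100)]
targets = [0.5, 0.96, 0.97, 0.99, 0.2, 0.95, 0.98, 0.93, 0.999, 0.96]

threshold = threshold_for_far(clutter)
far = rate_above(clutter, threshold)
hit_rate = rate_above(targets, threshold)
```

The same function computes both rates, since each is just the fraction of a chip set declared target at the chosen threshold.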

A standard set was developed for segmented size-specific neural networks, using the k-means
analysis to segment the training database. Here we have 3 BPNNs, each 120x60x2, and the output
values were fused to determine the final result. Fusion and final scoring are done by taking the
average or maximum value of the 3 clutter and target outputs associated with the small, medium
and elongated class neural nets. It was found that average or maximum gave the same result.

The abbreviation for this result is STD-SEG in table 1.
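
The fusion step can be sketched as follows, assuming each net produces a (clutter, target) output pair and that the fused decision goes to the larger fused output. The decision rule, function name and example values are assumptions made for illustration.

```python
def fuse_outputs(outputs, mode="average"):
    """Fuse (clutter, target) output pairs from several nets into one decision."""
    clutter_vals = [pair[0] for pair in outputs]
    target_vals = [pair[1] for pair in outputs]
    if mode == "average":
        clutter = sum(clutter_vals) / len(clutter_vals)
        target = sum(target_vals) / len(target_vals)
    elif mode == "maximum":
        clutter = max(clutter_vals)
        target = max(target_vals)
    else:
        raise ValueError("mode must be 'average' or 'maximum'")
    # Assumed rule: the larger fused output wins.
    label = "target" if target > clutter else "clutter"
    return label, clutter, target


# Hypothetical outputs of the small, medium and elongated nets for one chip.
chip_outputs = [(0.2, 0.8), (0.6, 0.4), (0.3, 0.7)]
by_average = fuse_outputs(chip_outputs, "average")
by_maximum = fuse_outputs(chip_outputs, "maximum")
```

In this example both modes label the chip as a target, in line with the observation that the two fusion rules gave the same result.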

A specific size-dependent Gabor feature set was investigated, where k-means analysis allowed a
reasonable choice of the Gabor feature sets. Segmentation of the training data was also
performed. Three BPNNs were developed, and final scoring was again done by fusing the neural net
outputs. The abbreviation for this result is SPC-SEG in table 1.

A fourth experiment was to allow all the training data to be used for each size-specific BPNN
and size-specific feature set. Here, we did not remove target silhouettes that were larger than
the dedicated size-specific feature set. For example, elongated target silhouettes were included
in training the “small” NN with Gabor features chosen for small targets. Again, fusion of the
output results was performed, and we use the abbreviation SPC-AEE in table 1 for the respective
result.

The final result in table 1 is with one BPNN instead of 3 size-specific BPNNs. Here, we looked
at including all the size-specific Gabor feature sets to train the single BPNN. This created a
total of 216 feature inputs for the neural network, and the architecture used was 216 by 108 by
2 using the 50% rule. There was no segmentation of the training data in this case, and the
resulting Gabor feature vectors represent an expanded standard Gabor set. We abbreviate this in
table 1 as STD-9.


6.  Conclusions


All indications are that the Gabor Jet features capture salient characteristics using this
simplified approach. The standard set's performance was not improved by segmenting the training
data based on size. This is because chips that were certainly targets and segmented as “medium”
or “elongated” were often identified as clutter when presented to the “small” BPNN. “Elongated”
target chips were, to a lesser degree, identified as clutter by the “medium” BPNN. When we
introduced the size-specific features and segmented the training data, we had an improvement in
hit rate over the standard set, but only to a small degree. This is an interesting result, since
one would expect the FAR for a “small” dedicated BPNN with size-specific features to be affected
in a similar manner as the standard set, for example by “elongated” targets being misclassified
as clutter when presented to the “small” BPNN. The improvement over the standard set may be due
to the fact that “small” silhouettes in test and validation are classified with higher
confidence, so that those “small” silhouettes originally misclassified are now correctly
classified, with no change to the misclassifications observed with the segmentation and standard
set. Removing all segmentation of the training data while keeping a dedicated BPNN for each
size-specific Gabor feature set gave considerable improvement. This architecture, though, is
very complex, considering that it has 3 parallel BPNNs with a total of 360 features. The final
result shows that the single BPNN can generalize [13] and perform as well as 3 separate BPNNs,
and that the power of the clutter rejection lies in the features. This architecture is simpler,
requiring only 216 input nodes as opposed to 360 and 108 hidden nodes as opposed to 180.

Further work could involve using a procedure to estimate the silhouette size post detection and,
based on that estimate, determining which of the 3 size-specific Gabor feature sets to use in
the feature extraction. An interesting result would be to train the 3 size-dedicated BPNNs using
a new rule developed from the automated measurement algorithm for the training silhouettes and
subsequent k-means analysis. In fact, a preliminary “block and bound” procedure was used, and
after k-means clustering the resultant clusters were similar to the k-means clusters for the “by
hand” measurement of target silhouettes done in 1997, which we used in our experiments.
Selecting 3 clusters for small, medium and elongated and using the “block and bound” procedure,
the k-means results, given as (height, width), were (23, 48) for small, (24, 56) for medium and
(24, 63) for elongated. These clusters would be used to segment the training database and
provide a decision structure to determine the size-specific Gabor feature set as each chip is
addressed. Then, one would subject the test and validation data sets to the measurement
procedure and in turn pass the size-specific Gabor features to the dedicated BPNN, with final
fusion of the output for scoring. This is a complex procedure, though, and even if the result
gains 14% in clutter rejection, one would doubt that it is more beneficial than the STD-9
approach.
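
The decision structure proposed above could be as simple as a nearest-cluster-center rule on the measured silhouette size. The sketch below assumes squared Euclidean distance in (height, width) and uses the “block and bound” cluster centers quoted in the text; the function name and the example measurements are hypothetical.

```python
# Cluster centers (height, width) from the "block and bound" k-means results.
SIZE_CENTERS = {
    "small": (23, 48),
    "medium": (24, 56),
    "elongated": (24, 63),
}


def select_feature_set(height, width, centers=SIZE_CENTERS):
    """Pick the size class whose cluster center is nearest to the silhouette."""
    def squared_distance(name):
        center_h, center_w = centers[name]
        return (height - center_h) ** 2 + (width - center_w) ** 2

    return min(centers, key=squared_distance)


# Hypothetical silhouette measurements for three chips.
choices = [select_feature_set(h, w) for h, w in [(23, 45), (24, 55), (25, 62)]]
```

Because the three centers differ mainly in width, this rule is in practice close to a width threshold, which may be worth keeping in mind if the height estimate is noisy.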